635. Design Log Storage System
Medium

You are given several logs, where each log contains a unique ID and timestamp. Timestamp is a string that has the following format: Year:Month:Day:Hour:Minute:Second, for example, 2017:01:01:23:59:59. All domains are zero-padded decimal numbers.

Implement the LogSystem class:

  • LogSystem() Initializes the LogSystem object.
  • void put(int id, string timestamp) Stores the given log (id, timestamp) in your storage system.
  • int[] retrieve(string start, string end, string granularity) Returns the IDs of the logs whose timestamps are within the range from start to end inclusive. start and end all have the same format as timestamp, and granularity means how precise the range should be (i.e. to the exact Day, Minute, etc.). For example, start = "2017:01:01:23:59:59", end = "2017:01:02:23:59:59", and granularity = "Day" means that we need to find the logs within the inclusive range from Jan. 1st 2017 to Jan. 2nd 2017, and the Hour, Minute, and Second for each log entry can be ignored.

 

Example 1:

Input
["LogSystem", "put", "put", "put", "retrieve", "retrieve"]
[[], [1, "2017:01:01:23:59:59"], [2, "2017:01:01:22:59:59"], [3, "2016:01:01:00:00:00"], ["2016:01:01:01:01:01", "2017:01:01:23:00:00", "Year"], ["2016:01:01:01:01:01", "2017:01:01:23:00:00", "Hour"]]
Output
[null, null, null, null, [3, 2, 1], [2, 1]]

Explanation
LogSystem logSystem = new LogSystem();
logSystem.put(1, "2017:01:01:23:59:59");
logSystem.put(2, "2017:01:01:22:59:59");
logSystem.put(3, "2016:01:01:00:00:00");

// return [3,2,1], because you need to return all logs between 2016 and 2017.
logSystem.retrieve("2016:01:01:01:01:01", "2017:01:01:23:00:00", "Year");

// return [2,1], because you need to return all logs between Jan. 1, 2016 01:XX:XX and Jan. 1, 2017 23:XX:XX.
// Log 3 is not returned because Jan. 1, 2016 00:00:00 comes before the start of the range.
logSystem.retrieve("2016:01:01:01:01:01", "2017:01:01:23:00:00", "Hour");

 

Constraints:

  • 1 <= id <= 500
  • 2000 <= Year <= 2017
  • 1 <= Month <= 12
  • 1 <= Day <= 31
  • 0 <= Hour <= 23
  • 0 <= Minute, Second <= 59
  • granularity is one of the values ["Year", "Month", "Day", "Hour", "Minute", "Second"].
  • At most 500 calls will be made to put and retrieve.
Accepted
23,589
Submissions
38,943
Seen this question in a real interview before?
Companies
Related Topics
Similar Questions

Average Rating: 3.56 (18 votes)

Premium

Solution


Approach #1 Converting timestamp into a number [Accepted]

This solution is based on converting the given timestap into a number. This can help because retrieval of Logs lying within a current range can be very easily done if the range to be used can be represented in the form of a single number. Let's look at the individual implementations directly.

  1. put: Firstly, we split the given timestamp based on : and store the individual components obtained into an stst array. Now, in order to put this log's entry into the storage, firstly, we convert this timestamp, now available as individual components in the stst array into a single number. To obtain a number which is unique for each timestamp, the number chosen is such that it represents the timestamp in terms of seconds. But, doing so for the Year values can lead to very large numbers, which could lead to a potential overflow. Since, we know that the Year's value can start from 2000 only, we subtract 1999 from the Year's value before doing the conversion of the given timestamp into seconds. We store this timestamp(in the form of a number now), along with the Log's id, in s listlist, which stores data in the form [converted_timestamp,id][converted\_timestamp, id].

  2. retrieve: In order to retrieve the logs' ids lying within the timestamp ss and ee, with a granularity gragra, firstly, we need to convert the given timestamps into seconds. But, since, we need to take care of granularity, before doing the conversion, we need to consider the effect of granularity. Granularity, in a way, depicts the precision of the results. Thus, for a granularity corresponding to a Day, we need to consider the portion of the timestamp considering the precision upto Day only. The rest of the fields can be assumed to be all 0's. After doing this for both the start and end timestamp, we do the conversion of the timestamp with the required precision into seconds. Once this has been done, we iterate over all the logs in the listlist to obtain the ids of those logs which lie within the required range. We keep on adding these ids to a resres list which is returned at the end of this function call.

**Performance Analysis**
  • The putmethod takes O(1)O(1) time to insert a new entry into the given set of logs.

  • The retrieve method takes O(n)O(n) time to retrieve the logs in the required range. Determining the granularity takes O(1)O(1) time. But, to find the logs lying in the required range, we need to iterate over the set of logs atleast once. Here, nn refers to the number of entries in the current set of logs.


Approach #2 Better Retrieval [Accepted]

This method remains almost the same as the last approach, except that, in order to store the timestamp data, we make use of a TreeMap instead of a list, with the key values being the timestamps converted in seconds form and the values being the ids of the corresponding logs.

Thus, the put method remains the same as the last approach. However, we can save some time in retrieve approach by making use of tailMap function of TreeMap, which stores the entries in the form of a sorted navigable binary tree. By making use of this function, instead of iterating over all the timestamps in the storage to find the timestamps lying within the required range(after considering the granularity as in the last approach), we can reduce the search space to only those elements whose timestamp is larger than(or equal to) the starting timestamp value.

**Performance Analysis**
  • The TreeMap is maintained internally as a Red-Black(balanced) tree. Thus, the putmethod takes O(log(n))O\big(log(n)\big) time to insert a new entry into the given set of logs. Here, nn refers to the number of entries currently present in the given set of logs.

  • The retrieve method takes O(mstart)O(m_{start}) time to retrieve the logs in the required range. Determining the granularity takes O(1)O(1) time. To find the logs in the required range, we only need to iterate over those elements which already lie in the required range. Here, mstartm_{start} refers to the number of entries in the current set of logs which have a timestamp greater than the current startstart value.

Report Article Issue

Comments: 15
saikrishnatvt's avatar
Read More

What is the need to convert the time stamp into seconds?
Why cant we straight away use it? For instance, why can't we represent the timestamp 2019:03:04:15:16:18 as 20190304151618 (or 190304151618)?

41
Show 2 replies
Reply
Share
Report
Explorer123's avatar
Read More

how about using subMap function TreeMap? In that case, the complexity will be reduced to O(logN + k), where k is the number of elements within range and N is the total number of elements in treeMap

17
Reply
Share
Report
llalies's avatar
Read More

Can you please explain the following lines?

        st[1] = st[1] - (st[1] == 0 ? 0 : 1);
        st[2] = st[2] - (st[2] == 0 ? 0 : 1);

15
Show 3 replies
Reply
Share
Report
TheKarateKid's avatar
Read More

Wouldn't it be much simpler to convert the timestamp to its Unix epoch value and store that? These solutions emphasize designing/converting a Timestamp storage unit rather than a Log Storage system.

4
Reply
Share
Report
neham's avatar
mzchen's avatar
Read More

The code is wrong for case:
["LogSystem","put","put","put","retrieve"]
[[],[1,"2017:01:31:23:59:59"],[2,"2017:02:01:23:59:59"],[3,"2017:01:31:00:00:00"],["2017:01:30:00:00:00","2017:02:01:00:00:00","Hour"]]

It returns [3] which indeed should be [1,3].

2
Show 1 reply
Reply
Share
Report
gotogosub's avatar
Read More

Whoever wrote the java solution does not know java formatting conventions

1
Reply
Share
Report
zhdv's avatar
Read More

A small note to solutions based on converting the timestamp into seconds - the author do implicitly takes into account that there are leap years but (worth explicit mentioning, though) but forgot about leap seconds. Basically, it is technically possible to have "23:59:60" (https://en.wikipedia.org/wiki/Leap_second).

From my point of view, taking these approaches at the interview might be risky....

1
Reply
Share
Report
whyseahike's avatar
Read More

2017:01:31:23:59:59 can also represented as 180131235959, this way can save lots of calculation at put()/retrieval() where only year needs a subtract.

1
Show 3 replies
Reply
Share
Report
tofunmi's avatar
Read More

Did anyone attempt this question by using the Trie data structure?

0
Reply
Share
Report

You don't have any submissions yet.

/23
Contribute
|||
Saved
Trie
This list is empty.